Automatic label generation for news comment clusters

Authors

  • Ahmet Aker
  • Monica Lestari Paramita
  • Emina Kurtic
  • Adam Funk
  • Emma Barker
  • Mark Hepple
  • Robert J. Gaizauskas
Abstract

We present a supervised approach to automatically labelling topic clusters of reader comments to online news. We use a feature set that includes both features capturing properties local to the cluster and features that capture aspects from the news article and from comments outside the cluster. We evaluate the approach in an automatic and a manual, task-based setting. Both evaluations show the approach to outperform a baseline method, which uses tf*idf to select comment-internal terms for use as topic labels. We illustrate how cluster labels can be used to generate cluster summaries and present two alternative summary formats: a pie chart summary and an abstractive summary.
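The tf*idf baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the tokenization, the weighting variant, or the collection over which document frequency is computed, so the choices below are assumptions: each cluster is treated as one document, idf is computed across clusters, and the function name `tfidf_labels` and the sample comments are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_labels(clusters, top_k=3):
    """Return the top_k highest-scoring tf*idf terms for each cluster.

    clusters: list of clusters, each a list of comment strings.
    Assumption of this sketch: each cluster counts as one document,
    and idf = log(N / df) over the N clusters.
    """
    # Term frequencies per cluster (all comments in a cluster pooled together).
    docs = [Counter(t for comment in cluster for t in tokenize(comment))
            for cluster in clusters]
    n_docs = len(docs)

    # Document frequency: in how many clusters each term occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    labels = []
    for doc in docs:
        scores = {term: tf * math.log(n_docs / df[term])
                  for term, tf in doc.items()}
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels.append(ranked[:top_k])
    return labels


# Hypothetical toy data: two comment clusters under one news article.
clusters = [
    ["The bus fares are too high", "Bus fares went up again this year"],
    ["The new stadium is a waste of money", "Stadium money should fund schools"],
]
labels = tfidf_labels(clusters, top_k=2)
print(labels)
```

Terms that occur in every cluster (here "the") receive a score of zero, which is why such a baseline tends to surface cluster-specific vocabulary. The supervised approach described in the abstract additionally draws on features from the news article and from comments outside the cluster, which this sketch does not attempt to model.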

Similar resources

A Graph-Based Approach to Topic Clustering for Online Comments to News

This paper investigates graph-based approaches to labeled topic clustering of reader comments in online news. For graph-based clustering we propose a linear regression model of similarity between the graph nodes (comments) based on similarity features and weights trained using automatically derived training data. To label the clusters our graph-based approach makes use of DBPedia to abstract to...

Column generation approach for the point-feature cartographic label placement problem

This paper proposes a column generation approach for the Point-Feature Cartographic Label Placement problem (PFCLP). The column generation is based on a Lagrangean relaxation with clusters proposed for problems modeled by conflict graphs. The PFCLP can be represented by a conflict graph where vertices are positions for each label and edges are potential overlaps between labels (vertices). The c...

Socially-Informed Timeline Generation for Complex Events

Existing timeline generation systems for complex events consider only information from traditional media, ignoring the rich social context provided by user-generated content that reveals representative public interests or insightful opinions. We instead aim to generate socially-informed timelines that contain both news article summaries and selected user comments. We present an optimization fra...

W-kmeans: Clustering News Articles Using WordNet

Document clustering is a powerful technique that has been widely used for organizing data into smaller and manageable information kernels. Several approaches have been proposed suffering however from problems like synonymy, ambiguity and lack of a descriptive content marking of the generated clusters. We are proposing the enhancement of standard kmeans algorithm using the external knowledge fro...

Predicting the Volume of Comments on Online News Stories (Abstract)

On-line news agents provide commenting facilities for readers to express their views with regard to news stories. The number of user supplied comments on a news article may be indicative of its importance or impact. We report on exploratory work that predicts the comment volume of news articles prior to publication using five feature sets. We address the prediction task as a two stage classific...

Publication date: 2016